Full duplicate candidate pruning for frequent connected subgraph mining
Abstract
Support calculation and duplicate detection are the most challenging and unavoidable subtasks in frequent connected subgraph (FCS) mining. The most successful FCS mining algorithms have focused on optimizing these subtasks, since the existing solutions for both have high computational complexity. In this paper, we propose two novel properties that allow all duplicate candidates to be removed before support calculation. We also introduce a fast support calculation strategy based on embedding structures. Both properties and the new embedding structure are used to design two new algorithms: gdFil, for mining all FCSs, and gdClosed, for mining all closed FCSs. The experimental results show that the proposed algorithms achieve the best performance among the well-known algorithms compared.
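To make the two subtasks concrete, here is a minimal Python sketch restricted to one-edge patterns. It is not gdFil or gdClosed, and it does not implement the properties proposed in the paper. The toy database `GRAPHS`, the helper names, and the sort-based canonical key are all assumptions made for illustration. It shows only the general idea: duplicate candidates are merged under a single key, and support is then read from a stored embedding structure rather than recomputed by isomorphism tests.

```python
from collections import defaultdict

# Hypothetical toy database: each graph is (vertex_labels, edges),
# where edges are (u, v, edge_label) tuples of an undirected labeled graph.
GRAPHS = [
    ({0: "C", 1: "O", 2: "C"}, [(0, 1, "s"), (1, 2, "s")]),
    ({0: "O", 1: "C"}, [(0, 1, "s")]),
    ({0: "C", 1: "N"}, [(0, 1, "d")]),
]


def canonical_edge(label_u, edge_label, label_v):
    """Sort the endpoint labels so both orientations of an edge share one key."""
    low, high = sorted((label_u, label_v))
    return (low, edge_label, high)


def single_edge_embeddings(graphs):
    """Map each canonical one-edge pattern to {graph_id: [(u, v), ...]}."""
    embeddings = defaultdict(lambda: defaultdict(list))
    for gid, (labels, edges) in enumerate(graphs):
        for u, v, edge_label in edges:
            key = canonical_edge(labels[u], edge_label, labels[v])
            embeddings[key][gid].append((u, v))
    return embeddings


def frequent_patterns(embeddings, min_support):
    """Support = number of distinct graphs holding at least one embedding."""
    return {
        pattern: len(by_graph)
        for pattern, by_graph in embeddings.items()
        if len(by_graph) >= min_support
    }


if __name__ == "__main__":
    emb = single_edge_embeddings(GRAPHS)
    print(frequent_patterns(emb, min_support=2))
```

In this sketch the pattern C-O and its mirror image O-C collapse into one candidate, so its support is counted once per graph however many embeddings that graph contains.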
Related papers
Duplicate Candidate Elimination and Fast Support Calculation for Frequent Subgraph Mining
Frequent connected subgraph mining (FCSM) is an interesting task with wide applications in real life. Most of the previous studies are focused on pruning search subspaces or optimizing the subgraph isomorphism (SI) tests. In this paper, a new property to remove all duplicate candidates in FCSM during the enumeration is introduced. Based on this property, a new FCSM algorithm called gdFil is pro...
Mining for Unconnected Frequent Graphs with Direct Subgraph Isomorphism Tests
In this paper we propose an algorithm that discovers both connected and unconnected frequent graphs from a set of graphs. Our approach is based on depth-first search candidate generation and direct execution of subgraph isomorphism tests over the database. Several search space pruning techniques are also proposed. Due to the lack of unconnected graph mining algorithms, we compare our algorithm with two g...
A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs
Problems such as closed frequent subset mining, itemset mining, and connected tree mining can be solved with polynomial delay. However, mining closed frequent connected subgraphs requires exponential time. In this paper, we present ECE-CloseSG, an algorithm for finding closed frequent unique edge label subgraphs. ECE-CloseSG uses a search space pruning and ap...
A new algorithm for mining frequent connected subgraphs based on adjacency matrices
Most of the Frequent Connected Subgraph Mining (FCSM) algorithms have been focused on detecting duplicate candidates using canonical form (CF) tests. CF tests have high computational complexity, which affects the efficiency of graph miners. In this paper, we introduce novel properties of the canonical adjacency matrices for reducing the number of CF tests in FCSM. Based on these properties, a n...
A Two-Phase Algorithm for Differentially Private Frequent Subgraph Mining
Mining frequent subgraphs from a collection of input graphs is an important task for exploratory data analysis on graph data. However, if the input graphs contain sensitive information, releasing discovered frequent subgraphs may pose considerable threats to individual privacy. In this paper, we study the problem of frequent subgraph mining (FSM) under the rigorous differential privacy model. W...
Journal:
- Integrated Computer-Aided Engineering
Volume 17
Publication date: 2010